Automatic Terminology Intelligibility Estimation for Readership-oriented Technical Writing

Authors

  • Yasuko Senda
  • Yasusi Sinohara
  • Manabu Okumura
Abstract

This paper describes automatic terminology intelligibility estimation for readership-oriented technical writing. We assume that term frequency, weighted by document type, can indicate how intelligible a term is to a given readership. From this standpoint, we analyzed the relationship between two quantities: (1) the average intelligibility levels of 46 technical terms, as rated by about 120 laymen, and (2) the term frequencies, i.e. the numbers of documents that an Internet search engine retrieves from various types of websites when each term is used as a keyword. The analysis shows that term intelligibility for a target readership can be estimated by regression analysis of the term frequencies weighted by website type. As pilot studies, we developed two regression models for estimating technical term intelligibility for the target readership: one uses a machine learning method based on ν-SVR, and the other uses multiple regression. To evaluate the models, we used the results of a survey on laymen's intelligibility levels for 50 new technical terms and compared them with our estimates. The correlation coefficient between the survey results and the estimated results was 0.66.
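
The multiple-regression variant described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three website types, the hit counts, the ratings, the log10 scaling of the hit counts, and the helper names `features`, `fit_ols`, `predict` and `pearson`. The ν-SVR variant is not shown.

```python
import math

def features(hits):
    """Log-scale raw hit counts (assumed scaling, not from the paper)."""
    return [math.log10(1 + h) for h in hits]

def fit_ols(X, y):
    """Fit y ~ b0 + b1*x1 + ... by solving the normal equations."""
    rows = [[1.0] + list(r) for r in X]          # prepend intercept column
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("singular design matrix")
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coefs, x):
    """Apply the fitted intercept and per-feature weights to one term."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], x))

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    return cov / math.sqrt(sum((p - ma) ** 2 for p in a)
                           * sum((q - mb) ** 2 for q in b))

# Hypothetical data: hits per term on (news, academic, consumer) sites,
# paired with a made-up average intelligibility rating.
train = [((120000, 3000, 45000), 4.1), ((800, 9000, 150), 1.6),
         ((56000, 1200, 30000), 3.8), ((2500, 15000, 900), 2.0),
         ((9000, 700, 12000), 3.2), ((300, 4000, 60), 1.3)]
held_out = [((70000, 2000, 20000), 3.9), ((1500, 11000, 400), 1.8),
            ((15000, 5000, 8000), 3.0), ((400, 2500, 90), 1.5)]

# One regression weight per website type, learned from the rated terms
coefs = fit_ols([features(h) for h, _ in train], [r for _, r in train])

# Compare estimates with ratings on terms not used for fitting
estimated = [predict(coefs, features(h)) for h, _ in held_out]
surveyed = [r for _, r in held_out]
print("correlation on held-out terms:", round(pearson(surveyed, estimated), 2))
```

The fitted coefficients play the role of the per-website-type weights, and the final correlation mirrors the abstract's evaluation on terms that were not used to build the model.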

Similar resources

Spectral Features for Automatic Blind Intelligibility Estimation of Spastic Dysarthric Speech

In this paper, we explore the use of the standard ITU-T P.563 speech quality estimation algorithm for automatic assessment of dysarthric speech intelligibility. A linear mapping consisting of three salient P.563 internal features is proposed and shown to accurately estimate spastic dysarthric speech intelligibility. Delta-energy features are further proposed in order to characterize the atypica...

Automatic speech recognition for assistive writing in speech supplemented word prediction

This paper describes a system for assistive writing, the Speech Supplemented Word Prediction Program (SSWPP). This system uses the first letter of a word typed by the user as well as the user’s (possibly low-intelligibility) speech to predict the intended word. The ASR system, which is the focus of this paper, is a speaker-dependent isolated-word recognition system. Word-level results from a no...

Relating automatic vowel space estimates to talker intelligibility

Differences in pronunciation have been shown to underlie significant talker-dependent intelligibility differences. There are several dimensions of variability that are correlated with talker intelligibility including pitch range, vowel-space expansion, and rhythmic patterns. Prior work has shown that some of the better predictors of individual intelligibility are based on the talker’s F1 by F2 ...

DEFINDER: Rule-based Methods for the Extraction of Medical Terminology and their Associated Definitions from On-line Text

INTRODUCTION The problem addressed in this paper concerns the automatic identification and extraction of medical terms along with their definitions and modifiers from full text consumer-oriented medical articles. The system, DEFINDER (Definition Finder), uses rule-based techniques. The output of our system can be used in several applications: creation and/or enhancement of on-line terminologica...

Objective Estimation of Dysarthric Speech Intelligibility

The de-facto standard for dysarthric intelligibility assessment is a subjective intelligibility test, performed by an expert. Subjective tests are often costly, biased and inconsistent because of their perceptual nature. Automatic objective assessment methods, in contrast, are repeatable and relatively cheap. Objective methods can be broken down into two subcategories: reference-free, and refer...

Publication date: 2006